
[RISCV] Convert LWU to LW if possible in RISCVOptWInstrs #144703


Open
wants to merge 6 commits into base: main

Conversation

asb
Contributor

@asb asb commented Jun 18, 2025

The original version of this patch handled LWU => LW and LHU => LH. This revised version limits it just to LWU, which is better motivated (it reduces the diff vs RV32, and LW is compressible while LWU is not).


This is currently implemented as part of RISCVOptWInstrs in order to reuse hasAllNBitUsers. However a new home or further refactoring will be needed (see the end of this note).
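
For context, the check being reused here is roughly "every user of the loaded value reads at most the low N bits of it". A minimal conceptual sketch of that idea is below; the function name and the tiny opcode list are illustrative assumptions, and the real hasAllNBitUsers in RISCVOptWInstrs.cpp handles many more opcodes and looks through copies:

static bool allUsersReadOnlyLow32Bits(const MachineInstr &MI,
                                      const MachineRegisterInfo &MRI) {
  // MI is the zero-extending load; operand 0 is the register it defines.
  Register DefReg = MI.getOperand(0).getReg();
  for (const MachineInstr &UserMI : MRI.use_nodbg_instructions(DefReg)) {
    switch (UserMI.getOpcode()) {
    case RISCV::ADDW:
    case RISCV::SLLW:
    case RISCV::FCVT_D_WU:
      // These read at most the low 32 bits of their GPR source, so they
      // cannot tell the difference between LWU and LW feeding them.
      continue;
    default:
      // Conservatively assume any other user may read the upper bits.
      return false;
    }
  }
  return true;
}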

Why prefer sign-extended loads?

  • LW is compressible while LWU is not.
  • Helps to minimise the diff vs RV32 (e.g. LWU vs LW)
  • Helps to minimise distracting diffs vs GCC. I see this come up frequently when comparing GCC code and in these cases it's a red herring.

Issues or open questions with this patch as it stands:

  • Doing something at the MI level makes sense as a last resort. I wonder if for some of the cases we could be producing a sign-extended load earlier on.
  • RISCVOptWInstrs is a slightly awkward home. It's currently not run for RV32, which means the LHU changes can actually add new diffs vs RV32. Potentially just this one transformation could be done for RV32.
  • Do we want to perform additional load narrowing? With the existing code, an LD will be narrowed to LW if only the lower bits are needed. The example that made me look at extending load normalisation in the first place was an LWU that immediately has a small mask applied, which could be narrowed to LB. It's not clear this has any real benefit.
  • As I've put the specific home of this change as an open question, I haven't added MIR tests for RISCVOptWInstrs. I'll add that if we decide this is the right home.

The first version of the patch that included LBU->LB conversion changed 95k instructions across a compile of llvm-test-suite (including SPEC 2017). As was pointed out, Zcb has C_LH and C_LHU but only C_LBU; there is no C_LB (and the base C extension has no compressed byte or halfword loads at all). Therefore, we avoid converting LBU to LB as it is less compressible. This leaves us with ~36k instructions changed across a compile of llvm-test-suite.
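
As a concrete illustration of the LWU => LW case, the fcvt_d_wu_load test updated by this patch corresponds to source along these lines (a sketch; the C++ function name is just for this example). The loaded 32-bit value is consumed only by fcvt.d.wu, which reads just the low 32 bits of its source register, so the zero-extending LWU can safely become LW; LW has a compressed form (C_LW) while LWU does not, and the RV64 output then matches RV32.

// Illustrative source for the fcvt_d_wu_load pattern: the unsigned 32-bit
// load feeds only a conversion that reads the low 32 bits of its source.
double convert_u32_to_double(const unsigned *p) {
  return static_cast<double>(*p);
}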

This is currently implemented as part of RISCVOptWInstrs in order to
reuse hasAllNBitUsers. However a new home or further refactoring will be
needed (see the end of this note).

Why prefer sign-extended loads?
* Sign-extending loads are more compressible. There is no compressed
  LWU, and compressed LBU and LHU are only available with Zcb.
* Helps to minimise the diff vs RV32 (e.g. LWU vs LW)
* Helps to minimise distracting diffs vs GCC. I see this come up
  frequently when comparing GCC code and in these cases it's a red
  herring.

Issues or open questions with this patch as it stands:
* Doing something at the MI level makes sense as a last resort. I wonder
  if for some of the cases we could be producing a sign-extended load
  earlier on.
* RISCVOptWInstrs is a slightly awkward home. It's currently not run for
  RV32, which means the LBU/LHU changes can actually add new diffs vs
  RV32. Potentially just this one transformation could be done for RV32.
* Do we want to perform additional load narrowing? With the existing
  code, an LD will be narrowed to LW if only the lower bits are needed.
  The example that made me look at extending load normalisation in the
  first place was an LWU that immediately has a small mask applied, which
  could be narrowed to LB. It's not clear this has any real benefit.

This patch changes 95k instructions across a compile of llvm-test-suite
(including SPEC 2017), and all tests complete successfully afterwards.
@llvmbot
Member

llvmbot commented Jun 18, 2025

@llvm/pr-subscribers-backend-risc-v

Author: Alex Bradbury (asb)

Changes

This is currently implemented as part of RISCVOptWInstrs in order to reuse hasAllNBitUsers. However a new home or further refactoring will be needed (see the end of this note).

Why prefer sign-extended loads?

  • Sign-extending loads are more compressible. There is no compressed LWU, and compressed LBU and LHU are only available with Zcb.
  • Helps to minimise the diff vs RV32 (e.g. LWU vs LW)
  • Helps to minimise distracting diffs vs GCC. I see this come up frequently when comparing GCC code and in these cases it's a red herring.

Issues or open questions with this patch as it stands:

  • Doing something at the MI level makes sense as a last resort. I wonder if for some of the cases we could be producing a sign-extended load earlier on.
  • RISCVOptWInstrs is a slightly awkward home. It's currently not run for RV32, which means the LBU/LHU changes can actually add new diffs vs RV32. Potentially just this one transformation could be done for RV32.
  • Do we want to perform additional load narrowing? With the existing code, an LD will be narrowed to LW if only the lower bits are needed. The example that made me look at extending load normalisation in the first place was an LWU that immediately has a small mask applied, which could be narrowed to LB. It's not clear this has any real benefit.

This patch changes 95k instructions across a compile of llvm-test-suite (including SPEC 2017), and all tests complete successfully afterwards.


Patch is 468.27 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/144703.diff

58 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp (+45)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/double-convert.ll (+5-11)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/float-convert.ll (+5-11)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/rv64zbb.ll (+1-1)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/rv64zbkb.ll (+3-3)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/wide-scalar-shift-by-byte-multiple-legalization.ll (+96-96)
  • (modified) llvm/test/CodeGen/RISCV/atomic-signext.ll (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/atomicrmw-cond-sub-clamp.ll (+1-1)
  • (modified) llvm/test/CodeGen/RISCV/bf16-promote.ll (+30-15)
  • (modified) llvm/test/CodeGen/RISCV/bfloat-convert.ll (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/bfloat.ll (+4-4)
  • (modified) llvm/test/CodeGen/RISCV/ctz_zero_return_test.ll (+8-8)
  • (modified) llvm/test/CodeGen/RISCV/double-convert-strict.ll (+6-12)
  • (modified) llvm/test/CodeGen/RISCV/double-convert.ll (+6-12)
  • (modified) llvm/test/CodeGen/RISCV/float-convert-strict.ll (+10-22)
  • (modified) llvm/test/CodeGen/RISCV/float-convert.ll (+10-22)
  • (modified) llvm/test/CodeGen/RISCV/fold-mem-offset.ll (+132-65)
  • (modified) llvm/test/CodeGen/RISCV/half-arith.ll (+1-1)
  • (modified) llvm/test/CodeGen/RISCV/half-convert-strict.ll (+15-27)
  • (modified) llvm/test/CodeGen/RISCV/half-convert.ll (+21-39)
  • (modified) llvm/test/CodeGen/RISCV/hoist-global-addr-base.ll (+15-7)
  • (modified) llvm/test/CodeGen/RISCV/local-stack-slot-allocation.ll (+5-5)
  • (modified) llvm/test/CodeGen/RISCV/mem64.ll (+3-3)
  • (modified) llvm/test/CodeGen/RISCV/memcmp-optsize.ll (+8-8)
  • (modified) llvm/test/CodeGen/RISCV/memcmp.ll (+4-4)
  • (modified) llvm/test/CodeGen/RISCV/memcpy-inline.ll (+94-94)
  • (modified) llvm/test/CodeGen/RISCV/memcpy.ll (+32-32)
  • (modified) llvm/test/CodeGen/RISCV/memmove.ll (+35-35)
  • (modified) llvm/test/CodeGen/RISCV/nontemporal.ll (+160-160)
  • (modified) llvm/test/CodeGen/RISCV/prefer-w-inst.mir (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/rv64zbb.ll (+1-1)
  • (modified) llvm/test/CodeGen/RISCV/rv64zbkb.ll (+1-1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/expandload.ll (+512-512)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vector-i8-index-cornercase.ll (+3-3)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-extract-subvector.ll (+33-15)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-buildvec.ll (+97-97)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-vrgather.ll (+32-15)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-mask-load-store.ll (-23)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-gather.ll (+87-87)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-reduction-int.ll (+17-8)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-strided-load-store-asm.ll (+5-5)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-strided-vpload.ll (+2-7)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-unaligned.ll (+4-4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vwaddu.ll (+31-15)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vwmulsu.ll (+31-15)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vwmulu.ll (-14)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vwsubu.ll (+35-17)
  • (modified) llvm/test/CodeGen/RISCV/rvv/memcpy-inline.ll (+13-13)
  • (modified) llvm/test/CodeGen/RISCV/rvv/stores-of-loads-merging.ll (+6-6)
  • (modified) llvm/test/CodeGen/RISCV/rvv/strided-vpload.ll (+15-6)
  • (modified) llvm/test/CodeGen/RISCV/srem-seteq-illegal-types.ll (+3-3)
  • (modified) llvm/test/CodeGen/RISCV/stack-clash-prologue.ll (+1-1)
  • (modified) llvm/test/CodeGen/RISCV/unaligned-load-store.ll (+13-13)
  • (modified) llvm/test/CodeGen/RISCV/urem-seteq-illegal-types.ll (+1-1)
  • (modified) llvm/test/CodeGen/RISCV/urem-vector-lkk.ll (+6-6)
  • (modified) llvm/test/CodeGen/RISCV/wide-scalar-shift-by-byte-multiple-legalization.ll (+132-132)
  • (modified) llvm/test/CodeGen/RISCV/wide-scalar-shift-legalization.ll (+72-72)
  • (modified) llvm/test/CodeGen/RISCV/zdinx-boundary-check.ll (+3-3)
diff --git a/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp b/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp
index ed61236415ccf..a141bae55b70f 100644
--- a/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp
+++ b/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp
@@ -71,6 +71,8 @@ class RISCVOptWInstrs : public MachineFunctionPass {
                       const RISCVSubtarget &ST, MachineRegisterInfo &MRI);
   bool appendWSuffixes(MachineFunction &MF, const RISCVInstrInfo &TII,
                        const RISCVSubtarget &ST, MachineRegisterInfo &MRI);
+  bool convertZExtLoads(MachineFunction &MF, const RISCVInstrInfo &TII,
+                       const RISCVSubtarget &ST, MachineRegisterInfo &MRI);
 
   void getAnalysisUsage(AnalysisUsage &AU) const override {
     AU.setPreservesCFG();
@@ -788,6 +790,47 @@ bool RISCVOptWInstrs::appendWSuffixes(MachineFunction &MF,
   return MadeChange;
 }
 
+bool RISCVOptWInstrs::convertZExtLoads(MachineFunction &MF,
+                                      const RISCVInstrInfo &TII,
+                                      const RISCVSubtarget &ST,
+                                      MachineRegisterInfo &MRI) {
+  bool MadeChange = false;
+  for (MachineBasicBlock &MBB : MF) {
+    for (MachineInstr &MI : MBB) {
+      unsigned WOpc;
+      int UsersWidth;
+      switch (MI.getOpcode()) {
+      default:
+        continue;
+      case RISCV::LBU:
+        WOpc = RISCV::LB;
+        UsersWidth = 8;
+        break;
+      case RISCV::LHU:
+        WOpc = RISCV::LH;
+        UsersWidth = 16;
+        break;
+      case RISCV::LWU:
+        WOpc = RISCV::LW;
+        UsersWidth = 32;
+        break;
+      }
+
+      if (hasAllNBitUsers(MI, ST, MRI, UsersWidth)) {
+        LLVM_DEBUG(dbgs() << "Replacing " << MI);
+        MI.setDesc(TII.get(WOpc));
+        MI.clearFlag(MachineInstr::MIFlag::NoSWrap);
+        MI.clearFlag(MachineInstr::MIFlag::NoUWrap);
+        MI.clearFlag(MachineInstr::MIFlag::IsExact);
+        LLVM_DEBUG(dbgs() << "     with " << MI);
+        MadeChange = true;
+      }
+    }
+  }
+
+  return MadeChange;
+}
+
 bool RISCVOptWInstrs::runOnMachineFunction(MachineFunction &MF) {
   if (skipFunction(MF.getFunction()))
     return false;
@@ -808,5 +851,7 @@ bool RISCVOptWInstrs::runOnMachineFunction(MachineFunction &MF) {
   if (ST.preferWInst())
     MadeChange |= appendWSuffixes(MF, TII, ST, MRI);
 
+  MadeChange |= convertZExtLoads(MF, TII, ST, MRI);
+
   return MadeChange;
 }
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/double-convert.ll b/llvm/test/CodeGen/RISCV/GlobalISel/double-convert.ll
index a49e94f4bc910..620c5ecc6c1e7 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/double-convert.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/double-convert.ll
@@ -246,17 +246,11 @@ define double @fcvt_d_wu(i32 %a) nounwind {
 }
 
 define double @fcvt_d_wu_load(ptr %p) nounwind {
-; RV32IFD-LABEL: fcvt_d_wu_load:
-; RV32IFD:       # %bb.0:
-; RV32IFD-NEXT:    lw a0, 0(a0)
-; RV32IFD-NEXT:    fcvt.d.wu fa0, a0
-; RV32IFD-NEXT:    ret
-;
-; RV64IFD-LABEL: fcvt_d_wu_load:
-; RV64IFD:       # %bb.0:
-; RV64IFD-NEXT:    lwu a0, 0(a0)
-; RV64IFD-NEXT:    fcvt.d.wu fa0, a0
-; RV64IFD-NEXT:    ret
+; CHECKIFD-LABEL: fcvt_d_wu_load:
+; CHECKIFD:       # %bb.0:
+; CHECKIFD-NEXT:    lw a0, 0(a0)
+; CHECKIFD-NEXT:    fcvt.d.wu fa0, a0
+; CHECKIFD-NEXT:    ret
 ;
 ; RV32I-LABEL: fcvt_d_wu_load:
 ; RV32I:       # %bb.0:
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/float-convert.ll b/llvm/test/CodeGen/RISCV/GlobalISel/float-convert.ll
index fa093623dd6f8..bbea7929a304e 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/float-convert.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/float-convert.ll
@@ -232,17 +232,11 @@ define float @fcvt_s_wu(i32 %a) nounwind {
 }
 
 define float @fcvt_s_wu_load(ptr %p) nounwind {
-; RV32IF-LABEL: fcvt_s_wu_load:
-; RV32IF:       # %bb.0:
-; RV32IF-NEXT:    lw a0, 0(a0)
-; RV32IF-NEXT:    fcvt.s.wu fa0, a0
-; RV32IF-NEXT:    ret
-;
-; RV64IF-LABEL: fcvt_s_wu_load:
-; RV64IF:       # %bb.0:
-; RV64IF-NEXT:    lwu a0, 0(a0)
-; RV64IF-NEXT:    fcvt.s.wu fa0, a0
-; RV64IF-NEXT:    ret
+; CHECKIF-LABEL: fcvt_s_wu_load:
+; CHECKIF:       # %bb.0:
+; CHECKIF-NEXT:    lw a0, 0(a0)
+; CHECKIF-NEXT:    fcvt.s.wu fa0, a0
+; CHECKIF-NEXT:    ret
 ;
 ; RV32I-LABEL: fcvt_s_wu_load:
 ; RV32I:       # %bb.0:
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/rv64zbb.ll b/llvm/test/CodeGen/RISCV/GlobalISel/rv64zbb.ll
index 9690302552090..65838f51fc920 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/rv64zbb.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/rv64zbb.ll
@@ -748,7 +748,7 @@ define signext i32 @ctpop_i32_load(ptr %p) nounwind {
 ;
 ; RV64ZBB-LABEL: ctpop_i32_load:
 ; RV64ZBB:       # %bb.0:
-; RV64ZBB-NEXT:    lwu a0, 0(a0)
+; RV64ZBB-NEXT:    lw a0, 0(a0)
 ; RV64ZBB-NEXT:    cpopw a0, a0
 ; RV64ZBB-NEXT:    ret
   %a = load i32, ptr %p
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/rv64zbkb.ll b/llvm/test/CodeGen/RISCV/GlobalISel/rv64zbkb.ll
index cd59c9e01806d..ba058ca0b500a 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/rv64zbkb.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/rv64zbkb.ll
@@ -114,7 +114,7 @@ define i64 @pack_i64_2(i32 signext %a, i32 signext %b) nounwind {
 define i64 @pack_i64_3(ptr %0, ptr %1) {
 ; RV64I-LABEL: pack_i64_3:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    lwu a0, 0(a0)
+; RV64I-NEXT:    lw a0, 0(a0)
 ; RV64I-NEXT:    lwu a1, 0(a1)
 ; RV64I-NEXT:    slli a0, a0, 32
 ; RV64I-NEXT:    or a0, a0, a1
@@ -122,8 +122,8 @@ define i64 @pack_i64_3(ptr %0, ptr %1) {
 ;
 ; RV64ZBKB-LABEL: pack_i64_3:
 ; RV64ZBKB:       # %bb.0:
-; RV64ZBKB-NEXT:    lwu a0, 0(a0)
-; RV64ZBKB-NEXT:    lwu a1, 0(a1)
+; RV64ZBKB-NEXT:    lw a0, 0(a0)
+; RV64ZBKB-NEXT:    lw a1, 0(a1)
 ; RV64ZBKB-NEXT:    pack a0, a1, a0
 ; RV64ZBKB-NEXT:    ret
   %3 = load i32, ptr %0, align 4
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/wide-scalar-shift-by-byte-multiple-legalization.ll b/llvm/test/CodeGen/RISCV/GlobalISel/wide-scalar-shift-by-byte-multiple-legalization.ll
index 69519c00f88ea..27c6d0240f987 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/wide-scalar-shift-by-byte-multiple-legalization.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/wide-scalar-shift-by-byte-multiple-legalization.ll
@@ -8,13 +8,13 @@ define void @lshr_4bytes(ptr %src.ptr, ptr %byteOff.ptr, ptr %dst) nounwind {
 ; RV64I-NEXT:    lbu a3, 1(a0)
 ; RV64I-NEXT:    lbu a4, 0(a0)
 ; RV64I-NEXT:    lbu a5, 2(a0)
-; RV64I-NEXT:    lbu a0, 3(a0)
+; RV64I-NEXT:    lb a0, 3(a0)
 ; RV64I-NEXT:    slli a3, a3, 8
 ; RV64I-NEXT:    or a3, a3, a4
 ; RV64I-NEXT:    lbu a4, 0(a1)
 ; RV64I-NEXT:    lbu a6, 1(a1)
 ; RV64I-NEXT:    lbu a7, 2(a1)
-; RV64I-NEXT:    lbu a1, 3(a1)
+; RV64I-NEXT:    lb a1, 3(a1)
 ; RV64I-NEXT:    slli a0, a0, 8
 ; RV64I-NEXT:    or a0, a0, a5
 ; RV64I-NEXT:    slli a6, a6, 8
@@ -85,13 +85,13 @@ define void @shl_4bytes(ptr %src.ptr, ptr %byteOff.ptr, ptr %dst) nounwind {
 ; RV64I-NEXT:    lbu a3, 1(a0)
 ; RV64I-NEXT:    lbu a4, 0(a0)
 ; RV64I-NEXT:    lbu a5, 2(a0)
-; RV64I-NEXT:    lbu a0, 3(a0)
+; RV64I-NEXT:    lb a0, 3(a0)
 ; RV64I-NEXT:    slli a3, a3, 8
 ; RV64I-NEXT:    or a3, a3, a4
 ; RV64I-NEXT:    lbu a4, 0(a1)
 ; RV64I-NEXT:    lbu a6, 1(a1)
 ; RV64I-NEXT:    lbu a7, 2(a1)
-; RV64I-NEXT:    lbu a1, 3(a1)
+; RV64I-NEXT:    lb a1, 3(a1)
 ; RV64I-NEXT:    slli a0, a0, 8
 ; RV64I-NEXT:    or a0, a0, a5
 ; RV64I-NEXT:    slli a6, a6, 8
@@ -162,13 +162,13 @@ define void @ashr_4bytes(ptr %src.ptr, ptr %byteOff.ptr, ptr %dst) nounwind {
 ; RV64I-NEXT:    lbu a3, 1(a0)
 ; RV64I-NEXT:    lbu a4, 0(a0)
 ; RV64I-NEXT:    lbu a5, 2(a0)
-; RV64I-NEXT:    lbu a0, 3(a0)
+; RV64I-NEXT:    lb a0, 3(a0)
 ; RV64I-NEXT:    slli a3, a3, 8
 ; RV64I-NEXT:    or a3, a3, a4
 ; RV64I-NEXT:    lbu a4, 0(a1)
 ; RV64I-NEXT:    lbu a6, 1(a1)
 ; RV64I-NEXT:    lbu a7, 2(a1)
-; RV64I-NEXT:    lbu a1, 3(a1)
+; RV64I-NEXT:    lb a1, 3(a1)
 ; RV64I-NEXT:    slli a0, a0, 8
 ; RV64I-NEXT:    or a0, a0, a5
 ; RV64I-NEXT:    slli a6, a6, 8
@@ -244,25 +244,25 @@ define void @lshr_8bytes(ptr %src.ptr, ptr %byteOff.ptr, ptr %dst) nounwind {
 ; RV64I-NEXT:    lbu a7, 4(a0)
 ; RV64I-NEXT:    lbu t0, 5(a0)
 ; RV64I-NEXT:    lbu t1, 6(a0)
-; RV64I-NEXT:    lbu a0, 7(a0)
+; RV64I-NEXT:    lb a0, 7(a0)
 ; RV64I-NEXT:    slli a4, a4, 8
 ; RV64I-NEXT:    slli a6, a6, 8
 ; RV64I-NEXT:    or a3, a4, a3
 ; RV64I-NEXT:    or a4, a6, a5
-; RV64I-NEXT:    lbu a5, 0(a1)
-; RV64I-NEXT:    lbu a6, 1(a1)
-; RV64I-NEXT:    lbu t2, 2(a1)
-; RV64I-NEXT:    lbu t3, 3(a1)
+; RV64I-NEXT:    lb a5, 0(a1)
+; RV64I-NEXT:    lb a6, 1(a1)
+; RV64I-NEXT:    lb t2, 2(a1)
+; RV64I-NEXT:    lb t3, 3(a1)
 ; RV64I-NEXT:    slli t0, t0, 8
 ; RV64I-NEXT:    slli a0, a0, 8
 ; RV64I-NEXT:    slli a6, a6, 8
 ; RV64I-NEXT:    or a7, t0, a7
 ; RV64I-NEXT:    or a0, a0, t1
 ; RV64I-NEXT:    or a5, a6, a5
-; RV64I-NEXT:    lbu a6, 4(a1)
-; RV64I-NEXT:    lbu t0, 5(a1)
-; RV64I-NEXT:    lbu t1, 6(a1)
-; RV64I-NEXT:    lbu a1, 7(a1)
+; RV64I-NEXT:    lb a6, 4(a1)
+; RV64I-NEXT:    lb t0, 5(a1)
+; RV64I-NEXT:    lb t1, 6(a1)
+; RV64I-NEXT:    lb a1, 7(a1)
 ; RV64I-NEXT:    slli t3, t3, 8
 ; RV64I-NEXT:    or t2, t3, t2
 ; RV64I-NEXT:    slli t0, t0, 8
@@ -395,25 +395,25 @@ define void @shl_8bytes(ptr %src.ptr, ptr %byteOff.ptr, ptr %dst) nounwind {
 ; RV64I-NEXT:    lbu a7, 4(a0)
 ; RV64I-NEXT:    lbu t0, 5(a0)
 ; RV64I-NEXT:    lbu t1, 6(a0)
-; RV64I-NEXT:    lbu a0, 7(a0)
+; RV64I-NEXT:    lb a0, 7(a0)
 ; RV64I-NEXT:    slli a4, a4, 8
 ; RV64I-NEXT:    slli a6, a6, 8
 ; RV64I-NEXT:    or a3, a4, a3
 ; RV64I-NEXT:    or a4, a6, a5
-; RV64I-NEXT:    lbu a5, 0(a1)
-; RV64I-NEXT:    lbu a6, 1(a1)
-; RV64I-NEXT:    lbu t2, 2(a1)
-; RV64I-NEXT:    lbu t3, 3(a1)
+; RV64I-NEXT:    lb a5, 0(a1)
+; RV64I-NEXT:    lb a6, 1(a1)
+; RV64I-NEXT:    lb t2, 2(a1)
+; RV64I-NEXT:    lb t3, 3(a1)
 ; RV64I-NEXT:    slli t0, t0, 8
 ; RV64I-NEXT:    slli a0, a0, 8
 ; RV64I-NEXT:    slli a6, a6, 8
 ; RV64I-NEXT:    or a7, t0, a7
 ; RV64I-NEXT:    or a0, a0, t1
 ; RV64I-NEXT:    or a5, a6, a5
-; RV64I-NEXT:    lbu a6, 4(a1)
-; RV64I-NEXT:    lbu t0, 5(a1)
-; RV64I-NEXT:    lbu t1, 6(a1)
-; RV64I-NEXT:    lbu a1, 7(a1)
+; RV64I-NEXT:    lb a6, 4(a1)
+; RV64I-NEXT:    lb t0, 5(a1)
+; RV64I-NEXT:    lb t1, 6(a1)
+; RV64I-NEXT:    lb a1, 7(a1)
 ; RV64I-NEXT:    slli t3, t3, 8
 ; RV64I-NEXT:    or t2, t3, t2
 ; RV64I-NEXT:    slli t0, t0, 8
@@ -541,25 +541,25 @@ define void @ashr_8bytes(ptr %src.ptr, ptr %byteOff.ptr, ptr %dst) nounwind {
 ; RV64I-NEXT:    lbu a7, 4(a0)
 ; RV64I-NEXT:    lbu t0, 5(a0)
 ; RV64I-NEXT:    lbu t1, 6(a0)
-; RV64I-NEXT:    lbu a0, 7(a0)
+; RV64I-NEXT:    lb a0, 7(a0)
 ; RV64I-NEXT:    slli a4, a4, 8
 ; RV64I-NEXT:    slli a6, a6, 8
 ; RV64I-NEXT:    or a3, a4, a3
 ; RV64I-NEXT:    or a4, a6, a5
-; RV64I-NEXT:    lbu a5, 0(a1)
-; RV64I-NEXT:    lbu a6, 1(a1)
-; RV64I-NEXT:    lbu t2, 2(a1)
-; RV64I-NEXT:    lbu t3, 3(a1)
+; RV64I-NEXT:    lb a5, 0(a1)
+; RV64I-NEXT:    lb a6, 1(a1)
+; RV64I-NEXT:    lb t2, 2(a1)
+; RV64I-NEXT:    lb t3, 3(a1)
 ; RV64I-NEXT:    slli t0, t0, 8
 ; RV64I-NEXT:    slli a0, a0, 8
 ; RV64I-NEXT:    slli a6, a6, 8
 ; RV64I-NEXT:    or a7, t0, a7
 ; RV64I-NEXT:    or a0, a0, t1
 ; RV64I-NEXT:    or a5, a6, a5
-; RV64I-NEXT:    lbu a6, 4(a1)
-; RV64I-NEXT:    lbu t0, 5(a1)
-; RV64I-NEXT:    lbu t1, 6(a1)
-; RV64I-NEXT:    lbu a1, 7(a1)
+; RV64I-NEXT:    lb a6, 4(a1)
+; RV64I-NEXT:    lb t0, 5(a1)
+; RV64I-NEXT:    lb t1, 6(a1)
+; RV64I-NEXT:    lb a1, 7(a1)
 ; RV64I-NEXT:    slli t3, t3, 8
 ; RV64I-NEXT:    or t2, t3, t2
 ; RV64I-NEXT:    slli t0, t0, 8
@@ -695,7 +695,7 @@ define void @lshr_16bytes(ptr %src.ptr, ptr %byteOff.ptr, ptr %dst) nounwind {
 ; RV64I-NEXT:    lbu a7, 4(a0)
 ; RV64I-NEXT:    lbu t0, 5(a0)
 ; RV64I-NEXT:    lbu t1, 6(a0)
-; RV64I-NEXT:    lbu t2, 7(a0)
+; RV64I-NEXT:    lb t2, 7(a0)
 ; RV64I-NEXT:    lbu t3, 8(a0)
 ; RV64I-NEXT:    lbu t4, 9(a0)
 ; RV64I-NEXT:    lbu t5, 10(a0)
@@ -707,7 +707,7 @@ define void @lshr_16bytes(ptr %src.ptr, ptr %byteOff.ptr, ptr %dst) nounwind {
 ; RV64I-NEXT:    lbu a5, 12(a0)
 ; RV64I-NEXT:    lbu a6, 13(a0)
 ; RV64I-NEXT:    lbu s0, 14(a0)
-; RV64I-NEXT:    lbu a0, 15(a0)
+; RV64I-NEXT:    lb a0, 15(a0)
 ; RV64I-NEXT:    slli t0, t0, 8
 ; RV64I-NEXT:    slli t2, t2, 8
 ; RV64I-NEXT:    slli t4, t4, 8
@@ -729,7 +729,7 @@ define void @lshr_16bytes(ptr %src.ptr, ptr %byteOff.ptr, ptr %dst) nounwind {
 ; RV64I-NEXT:    lbu t3, 4(a1)
 ; RV64I-NEXT:    lbu t4, 5(a1)
 ; RV64I-NEXT:    lbu s0, 6(a1)
-; RV64I-NEXT:    lbu a1, 7(a1)
+; RV64I-NEXT:    lb a1, 7(a1)
 ; RV64I-NEXT:    slli t6, t6, 8
 ; RV64I-NEXT:    or t5, t6, t5
 ; RV64I-NEXT:    slli t4, t4, 8
@@ -1028,7 +1028,7 @@ define void @lshr_16bytes_wordOff(ptr %src.ptr, ptr %wordOff.ptr, ptr %dst) noun
 ; RV64I-NEXT:    lbu a7, 4(a0)
 ; RV64I-NEXT:    lbu t0, 5(a0)
 ; RV64I-NEXT:    lbu t1, 6(a0)
-; RV64I-NEXT:    lbu t2, 7(a0)
+; RV64I-NEXT:    lb t2, 7(a0)
 ; RV64I-NEXT:    lbu t3, 8(a0)
 ; RV64I-NEXT:    lbu t4, 9(a0)
 ; RV64I-NEXT:    lbu t5, 10(a0)
@@ -1040,7 +1040,7 @@ define void @lshr_16bytes_wordOff(ptr %src.ptr, ptr %wordOff.ptr, ptr %dst) noun
 ; RV64I-NEXT:    lbu a5, 12(a0)
 ; RV64I-NEXT:    lbu a6, 13(a0)
 ; RV64I-NEXT:    lbu s0, 14(a0)
-; RV64I-NEXT:    lbu a0, 15(a0)
+; RV64I-NEXT:    lb a0, 15(a0)
 ; RV64I-NEXT:    slli t0, t0, 8
 ; RV64I-NEXT:    slli t2, t2, 8
 ; RV64I-NEXT:    slli t4, t4, 8
@@ -1062,7 +1062,7 @@ define void @lshr_16bytes_wordOff(ptr %src.ptr, ptr %wordOff.ptr, ptr %dst) noun
 ; RV64I-NEXT:    lbu t3, 4(a1)
 ; RV64I-NEXT:    lbu t4, 5(a1)
 ; RV64I-NEXT:    lbu s0, 6(a1)
-; RV64I-NEXT:    lbu a1, 7(a1)
+; RV64I-NEXT:    lb a1, 7(a1)
 ; RV64I-NEXT:    slli t6, t6, 8
 ; RV64I-NEXT:    or t5, t6, t5
 ; RV64I-NEXT:    slli t4, t4, 8
@@ -1361,7 +1361,7 @@ define void @shl_16bytes(ptr %src.ptr, ptr %byteOff.ptr, ptr %dst) nounwind {
 ; RV64I-NEXT:    lbu a7, 4(a0)
 ; RV64I-NEXT:    lbu t0, 5(a0)
 ; RV64I-NEXT:    lbu t1, 6(a0)
-; RV64I-NEXT:    lbu t2, 7(a0)
+; RV64I-NEXT:    lb t2, 7(a0)
 ; RV64I-NEXT:    lbu t3, 8(a0)
 ; RV64I-NEXT:    lbu t4, 9(a0)
 ; RV64I-NEXT:    lbu t5, 10(a0)
@@ -1373,7 +1373,7 @@ define void @shl_16bytes(ptr %src.ptr, ptr %byteOff.ptr, ptr %dst) nounwind {
 ; RV64I-NEXT:    lbu a5, 12(a0)
 ; RV64I-NEXT:    lbu a6, 13(a0)
 ; RV64I-NEXT:    lbu s0, 14(a0)
-; RV64I-NEXT:    lbu a0, 15(a0)
+; RV64I-NEXT:    lb a0, 15(a0)
 ; RV64I-NEXT:    slli t0, t0, 8
 ; RV64I-NEXT:    slli t2, t2, 8
 ; RV64I-NEXT:    slli t4, t4, 8
@@ -1395,7 +1395,7 @@ define void @shl_16bytes(ptr %src.ptr, ptr %byteOff.ptr, ptr %dst) nounwind {
 ; RV64I-NEXT:    lbu t3, 4(a1)
 ; RV64I-NEXT:    lbu t4, 5(a1)
 ; RV64I-NEXT:    lbu s0, 6(a1)
-; RV64I-NEXT:    lbu a1, 7(a1)
+; RV64I-NEXT:    lb a1, 7(a1)
 ; RV64I-NEXT:    slli t6, t6, 8
 ; RV64I-NEXT:    or t5, t6, t5
 ; RV64I-NEXT:    slli t4, t4, 8
@@ -1690,7 +1690,7 @@ define void @shl_16bytes_wordOff(ptr %src.ptr, ptr %wordOff.ptr, ptr %dst) nounw
 ; RV64I-NEXT:    lbu a7, 4(a0)
 ; RV64I-NEXT:    lbu t0, 5(a0)
 ; RV64I-NEXT:    lbu t1, 6(a0)
-; RV64I-NEXT:    lbu t2, 7(a0)
+; RV64I-NEXT:    lb t2, 7(a0)
 ; RV64I-NEXT:    lbu t3, 8(a0)
 ; RV64I-NEXT:    lbu t4, 9(a0)
 ; RV64I-NEXT:    lbu t5, 10(a0)
@@ -1702,7 +1702,7 @@ define void @shl_16bytes_wordOff(ptr %src.ptr, ptr %wordOff.ptr, ptr %dst) nounw
 ; RV64I-NEXT:    lbu a5, 12(a0)
 ; RV64I-NEXT:    lbu a6, 13(a0)
 ; RV64I-NEXT:    lbu s0, 14(a0)
-; RV64I-NEXT:    lbu a0, 15(a0)
+; RV64I-NEXT:    lb a0, 15(a0)
 ; RV64I-NEXT:    slli t0, t0, 8
 ; RV64I-NEXT:    slli t2, t2, 8
 ; RV64I-NEXT:    slli t4, t4, 8
@@ -1724,7 +1724,7 @@ define void @shl_16bytes_wordOff(ptr %src.ptr, ptr %wordOff.ptr, ptr %dst) nounw
 ; RV64I-NEXT:    lbu t3, 4(a1)
 ; RV64I-NEXT:    lbu t4, 5(a1)
 ; RV64I-NEXT:    lbu s0, 6(a1)
-; RV64I-NEXT:    lbu a1, 7(a1)
+; RV64I-NEXT:    lb a1, 7(a1)
 ; RV64I-NEXT:    slli t6, t6, 8
 ; RV64I-NEXT:    or t5, t6, t5
 ; RV64I-NEXT:    slli t4, t4, 8
@@ -2020,7 +2020,7 @@ define void @ashr_16bytes(ptr %src.ptr, ptr %byteOff.ptr, ptr %dst) nounwind {
 ; RV64I-NEXT:    lbu a7, 4(a0)
 ; RV64I-NEXT:    lbu t0, 5(a0)
 ; RV64I-NEXT:    lbu t1, 6(a0)
-; RV64I-NEXT:    lbu t2, 7(a0)
+; RV64I-NEXT:    lb t2, 7(a0)
 ; RV64I-NEXT:    lbu t3, 8(a0)
 ; RV64I-NEXT:    lbu t4, 9(a0)
 ; RV64I-NEXT:    lbu t5, 10(a0)
@@ -2032,7 +2032,7 @@ define void @ashr_16bytes(ptr %src.ptr, ptr %byteOff.ptr, ptr %dst) nounwind {
 ; RV64I-NEXT:    lbu a5, 12(a0)
 ; RV64I-NEXT:    lbu a6, 13(a0)
 ; RV64I-NEXT:    lbu s0, 14(a0)
-; RV64I-NEXT:    lbu a0, 15(a0)
+; RV64I-NEXT:    lb a0, 15(a0)
 ; RV64I-NEXT:    slli t0, t0, 8
 ; RV64I-NEXT:    slli t2, t2, 8
 ; RV64I-NEXT:    slli t4, t4, 8
@@ -2054,7 +2054,7 @@ define void @ashr_16bytes(ptr %src.ptr, ptr %byteOff.ptr, ptr %dst) nounwind {
 ; RV64I-NEXT:    lbu t3, 4(a1)
 ; RV64I-NEXT:    lbu t4, 5(a1)
 ; RV64I-NEXT:    lbu s0, 6(a1)
-; RV64I-NEXT:    lbu a1, 7(a1)
+; RV64I-NEXT:    lb a1, 7(a1)
 ; RV64I-NEXT:    slli t6, t6, 8
 ; RV64I-NEXT:    or t5, t6, t5
 ; RV64I-NEXT:    slli t4, t4, 8
@@ -2353,7 +2353,7 @@ define void @ashr_16bytes_wordOff(ptr %src.ptr, ptr %wordOff.ptr, ptr %dst) noun
 ; RV64I-NEXT:    lbu a7, 4(a0)
 ; RV64I-NEXT:    lbu t0, 5(a0)
 ; RV64I-NEXT:    lbu t1, 6(a0)
-; RV64I-NEXT:    lbu t2, 7(a0)
+; RV64I-NEXT:    lb t2, 7(a0)
 ; RV64I-NEXT:    lbu t3, 8(a0)
 ; RV64I-NEXT:    lbu t4, 9(a0)
 ; RV64I-NEXT:    lbu t5, 10(a0)
@@ -2365,7 +2365,7 @@ define void @ashr_16bytes_wordOff(ptr %src.ptr, ptr %wordOff.ptr, ptr %dst) noun
 ; RV64I-NEXT:    lbu a5, 12(a0)
 ; RV64I-NEXT:    lbu a6, 13(a0)
 ; RV64I-NEXT:    lbu s0, 14(a0)
-; RV64I-NEXT:    lbu a0, 15(a0)
+; RV64I-NEXT:    lb a0, 15(a0)
 ; RV64I-NEXT:    slli t0, t0, 8
 ; RV64I-NEXT:    slli t2, t2, 8
 ; RV64I-NEXT:    slli t4, t4, 8
@@ -2387,7 +2387,7 @@ define void @ashr_16bytes_wordOff(ptr %src.ptr, ptr %wordOff.ptr, ptr %dst) noun
 ; RV64I-NEXT:    lbu t3, 4(a1)
 ; RV64I-NEXT:    lbu t4, 5(a1)
 ; RV64I-NEXT:    lbu s0, 6(a1)
-; RV64I-NEXT:    lbu a1, 7(a1)
+; RV64I-NEXT:    lb a1, 7(a1)
 ; RV64I-NEXT:    slli t6, t6, 8
 ; RV64I-NEXT:    or t5, t6, t5
 ; RV64I-NEXT:    slli t4, t4, 8
@@ -2697,7 +2697,7 @@ define void @lshr_32bytes(ptr %src.ptr, ptr %byteOff.ptr, ptr %dst) nounwind {
 ; RV64I-NEXT:    lbu a7, 4(a0)
 ; RV64I-NEXT:    lbu t0, 5(a0)
 ; RV64I-NEXT:    lbu t1, 6(a0)
-; RV64I-NEXT:    lbu t2, 7(a0)
+; RV64I-NEXT:    lb t2, 7(a0)
 ; RV64I-NEXT:    lbu t3, 8(a0)
 ; RV64I-NEXT:    lbu t4, 9(a0)
 ; RV64I-NEXT:    lbu t5, 10(a0)
@@ -2705,7 +2705,7 @@ define void @lshr_32bytes(ptr %src.ptr, ptr %byteOff.ptr, ptr %dst) nounwind {
 ; RV64I-NEXT:    lbu s0, 12(a0)
 ; RV64I-NEXT:    lbu s1, 13(a0)
 ; RV64I-NEXT:    lbu s2, 14(a0)
-; RV64I-NEXT:    lbu s3, 15(a0)
+; RV64I-NEXT:    lb s3, 15(a0)
 ; RV64I-NEXT:    lbu s4, 16(a0)
 ; RV64I-NEXT:    lbu s5, 17(a0)
 ; RV64I-NEXT:    lbu s6, 18(a0)
@@ -2719,7 +2719,7 @@ define void @lshr_32bytes(ptr %src.ptr, ptr %byteOff.ptr, ptr %dst) nounwind {
 ; RV64I-NEXT:    lbu s8, 20(a0)
 ; RV64I-NEXT:    lbu s9, 21(a0)
 ; RV64I-NEXT:    lbu s10, 22(a0)
-; RV64I-NEXT:    lbu s11, 23(a0)
+; RV64I-NEXT:    lb s11, 23(a0)
 ; RV64I-NEXT:    slli t2, t2, 8
 ; RV64I-NEXT:    slli t4, t4, 8
 ; RV64I-NEXT:    slli t6, t6, 8
@@ -2741,7 +2741,7 @@ define void @lshr_32bytes(ptr %src.ptr, ptr %byteOff.ptr, ptr %dst) nounwind {
 ; RV64I-NEXT:    lbu s2, 28(a0)
 ; RV64I-NEXT:    lbu s3, 29(a0)
 ; RV64I-NEXT:    lbu s4, 30(a0)
-; RV64I-NEXT:    lbu a0, 31(a0)
+; RV64I-NEXT:    lb a0, 31(a0)
 ; RV64I-NEXT:    slli s9, s9, 8
 ; RV64I-NEXT:    slli s11, s11, 8
 ; RV64I-NEXT:    slli t6, t6, 8
@@ -2763,7 +2763,7 @@ define void @lshr_32bytes(ptr %src.ptr, ptr %byteOff.ptr, ptr %dst) nounwind {
 ; RV64I-NEXT:    lbu a0, 4(a1)
 ; RV64I-NEXT:    lbu s1, 5(a1)
 ; RV64I-NEXT:    lbu s4, 6(a1)
-; RV64I-NEXT:    lbu a1, 7(a1)
+; RV64I-NEXT:    lb a1, 7(a1)
 ; RV64I-NEXT:    slli s8, s8, 8
 ; RV64I-NEXT:    or s7, s8, s7
 ; RV64I-NEXT:    slli s1, s1, 8
@@ -3621,7 +3621,7 @@ define void @lshr_32bytes_wordOff(ptr %src.ptr, ptr %wordOff.ptr, ptr %dst) noun
 ; RV64I-NEXT:    lbu a7, 4(a0)
 ; RV64I-NEXT:    lbu t0, 5(a0)
 ; RV64I-NEXT:    lbu t1, 6(a0)
-; RV64I-NEXT:    lbu t2, 7(a0)
+; RV64I-NEXT:    lb t2, 7(a0)
 ; RV64I-NEXT:    lbu t3, 8(a0)
 ; RV64I-NEXT:    lbu t4, 9(a0)
 ; RV64I-NEXT:    lbu t5, 10(a0)
@@ -3629,7 +3629,7 @@ define void @lshr_32bytes_wordOff(ptr %src.ptr, ptr %wordOff.ptr, ptr %dst) noun
 ; RV64I-NEXT:    lbu s0, 12(a0)
 ; RV64I-NEXT:    lbu s1, 13(a0)
 ; RV64I-NEXT:    lbu s2, 14(a0)
-; RV64I-NEXT:    lbu s3, 15(a0)
+; RV64I-NEXT:    lb s3, 15(a0)
 ; RV64I-NEXT:    lbu s4, 16(a0)
 ; RV64I-NEXT:    ...
[truncated]

@llvmbot
Member

llvmbot commented Jun 18, 2025

@llvm/pr-subscribers-llvm-globalisel



github-actions bot commented Jun 18, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@topperc
Collaborator

topperc commented Jun 18, 2025

If I remember right, there is no c.lb in Zcb. Does this make compression worse?

@asb
Contributor Author

asb commented Jun 18, 2025

If I remember right, there is no c.lb in Zcb. Does this make compression worse?

You're right, I'll update to exclude LB.

EDIT: Now pushed, and patch description updated.

@topperc
Collaborator

topperc commented Jun 18, 2025

Have you looked at doing this during instruction selection using the hasAllNBitUsers in RISCVISelDAGToDAG.cpp? That isn't restricted to RV64. Though it won't work across basic blocks.

@topperc
Collaborator

topperc commented Jun 18, 2025

I think changing LWU->LW is useful for compression and minimizing RV32/RV64 delta.

I'm skeptical that LWU->LW and LHU->LH will help us match gcc better. Here are trivial examples where LLVM used LH/LW and gcc used LHU/LWU. https://godbolt.org/z/cbje4aT7P I guess it could be better on average, but it certainly doesn't guarantee a match to gcc.

@asb
Contributor Author

asb commented Jun 19, 2025

I think changing LWU->LW is useful for compression and minimizing RV32/RV64 delta.

Agreed.

I'm skeptical that LWU->LW and LHU->LH will help us match gcc better. Here are trivial examples where LLVM used LH/LW and gcc used LHU/LWU. https://godbolt.org/z/cbje4aT7P I guess it could be better on average, but it certainly doesn't guarantee a match to gcc.

That may be true. I spotted this while looking at a workload where we had some LWU instructions that GCC didn't, and wondering if it was an indicator of us overall making some different/worse choices in terms of sign/zero extension (and of course found it was just a case where either was equivalent), and I'm sure I've seen the same before. But I totally believe there are other cases where we make different choices. I'll quantify how often it kicks in, but I probably wouldn't mind dropping LHU->LH. Perhaps the argument for this kind of change is more just "canonicalisation".

Have you looked at doing this during instruction selection using the hasAllNBitUsers in RISCVISelDAGToDAG.cpp

I'll try that and report back.

@asb asb changed the title from "[RISCV] Switch to sign-extended loads if possible in RISCVOptWInstrs" to "[RISCV] Convert LWU to LW if possible in RISCVOptWInstrs" Jun 25, 2025
@asb
Contributor Author

asb commented Jun 25, 2025

Have you looked at doing this during instruction selection using the hasAllNBitUsers in RISCVISelDAGToDAG.cpp

I'm limiting scope to just LWU => LW for now (there perhaps isn't much of an argument for LHU => LH beyond picking a "canonical form"). I implemented this at ISel with the following patch:

--- a/llvm/lib/Target/RISCV/GISel/RISCVInstructionSelector.cpp
+++ b/llvm/lib/Target/RISCV/GISel/RISCVInstructionSelector.cpp
@@ -208,7 +208,8 @@ bool RISCVInstructionSelector::hasAllNBitUsers(const MachineInstr &MI,
           MI.getOpcode() == TargetOpcode::G_AND ||
           MI.getOpcode() == TargetOpcode::G_OR ||
           MI.getOpcode() == TargetOpcode::G_XOR ||
-          MI.getOpcode() == TargetOpcode::G_SEXT_INREG || Depth != 0) &&
+          MI.getOpcode() == TargetOpcode::G_SEXT_INREG ||
+          MI.getOpcode() == TargetOpcode::G_ZEXTLOAD || Depth != 0) &&
          "Unexpected opcode");

   if (Depth >= RISCVInstructionSelector::MaxRecursionDepth)
--- a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
@@ -3532,7 +3532,8 @@ bool RISCVDAGToDAGISel::hasAllNBitUsers(SDNode *Node, unsigned Bits,
           Node->getOpcode() == ISD::SRL || Node->getOpcode() == ISD::AND ||
           Node->getOpcode() == ISD::OR || Node->getOpcode() == ISD::XOR ||
           Node->getOpcode() == ISD::SIGN_EXTEND_INREG ||
-          isa<ConstantSDNode>(Node) || Depth != 0) &&
+          Node->getOpcode() == ISD::LOAD || isa<ConstantSDNode>(Node) ||
+          Depth != 0) &&
          "Unexpected opcode");

   if (Depth >= SelectionDAG::MaxRecursionDepth)
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.td
@@ -2083,6 +2083,13 @@ class binop_allwusers<SDPatternOperator operator>
   let GISelPredicateCode = [{ return hasAllWUsers(MI); }];
 }

+class unaryop_allwusers<SDPatternOperator operator>
+    : PatFrag<(ops node:$arg), (i64(operator node:$arg)), [{
+  return hasAllWUsers(N);
+}]> {
+  let GISelPredicateCode = [{ return hasAllWUsers(MI); }];
+}
+
 def sexti32_allwusers : PatFrag<(ops node:$src),
                                 (sext_inreg node:$src, i32), [{
   return hasAllWUsers(N);
@@ -2157,6 +2164,7 @@ def : Pat<(or_is_add 33signbits_node:$rs1, simm12:$imm),

 def : LdPat<sextloadi32, LW, i64>;
 def : LdPat<extloadi32, LW, i64>;
+def : LdPat<unaryop_allwusers<zextloadi32>, LW, i64>;
 def : LdPat<zextloadi32, LWU, i64>;
 def : LdPat<load, LD, i64>;

I've found that this is much less effective than the RISCVOptWInstrs change:

  • Doing it at ISel results in ~4600 instructions changed across the test suite
  • Doing it in RISCVOptWInstrs (updated to do LWU=>LW only) changes ~20500 instructions
  • There is no additional benefit in doing both (as you'd expect - but I ran this just to check).

I'm about to push changes to this patch to limit it to just the LWU change. Looking at the diffs between doing it at ISel and this way, there are some cases that at first glance I would have thought would have been handled - I'll pick through a couple just to check there's nothing surprising going on.

Member

@lenary lenary left a comment


LGTM

@lenary
Member

lenary commented Jul 3, 2025

To maybe add some more info to my LGTM: I accept this is probably not the perfect place for this, but it's a place that seems to work.

@@ -808,5 +831,7 @@ bool RISCVOptWInstrs::runOnMachineFunction(MachineFunction &MF) {
if (ST.preferWInst())
MadeChange |= appendWSuffixes(MF, TII, ST, MRI);

MadeChange |= convertZExtLoads(MF, TII, ST, MRI);
Collaborator


Can we combine stripWSuffixes/appendWSuffixes/convertZExtLoads into a single function that walks over the function once and does the right thing?
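
For illustration, a minimal sketch of that suggestion is below. The function name is hypothetical and the opcode lists and legality checks are abbreviated; this is not the actual implementation, just the shape of a single walk dispatching to the existing rewrites:

bool RISCVOptWInstrs::processFunction(MachineFunction &MF,
                                      const RISCVInstrInfo &TII,
                                      const RISCVSubtarget &ST,
                                      MachineRegisterInfo &MRI) {
  bool MadeChange = false;
  for (MachineBasicBlock &MBB : MF) {
    for (MachineInstr &MI : MBB) {
      switch (MI.getOpcode()) {
      default:
        break;
      case RISCV::ADDW: // ...plus the other W-form candidates.
        // W-suffix stripping (what stripWSuffixes does today), subject to the
        // same legality checks as the existing code.
        break;
      case RISCV::ADD: // ...plus the other candidates for gaining a W suffix.
        // W-suffix appending when ST.preferWInst() (appendWSuffixes today).
        break;
      case RISCV::LWU:
        // Zero-extending load conversion (convertZExtLoads in this patch);
        // the MI flag clearing from the patch is omitted here for brevity.
        if (hasAllNBitUsers(MI, ST, MRI, 32)) {
          MI.setDesc(TII.get(RISCV::LW));
          MadeChange = true;
        }
        break;
      }
    }
  }
  return MadeChange;
}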
